Abstract

We propose a formula for the number of pairs of consecutive primes $p_n, p_{n+1} < x$
separated by the gap $d = p_{n+1} - p_n$, expressed directly by the number of all primes $< x$, i.e. by
$\pi(x)$. As an application of this formula we formulate 7 conjectures, among others for the
maximal gap between two consecutive primes smaller than $x$, for the generalized Brun's
constants and for the first occurrence of a given gap $d$. The leading term $\log\log(x)$ in
the prime harmonic sum is also reproduced correctly from our guesses. These conjectures are
supported by computer data.
"A subject that has attracted attention
but concerning which the known results
leave much to be desired,
is that of the behaviour of $p_{n+1} - p_n$,
where $p_n$ denotes the n-th prime."
H. Davenport, in [HI p. 173]



1 Introduction. 

In 1922 G. H. Hardy and J. E. Littlewood proposed 15 conjectures in the famous paper [18].
Conjecture B of their paper states:

There are infinitely many prime pairs $(p, p')$, where $p' = p + d$, for every even $d$. If $\pi_d(x)$
denotes the number of prime pairs differing by $d$ and less than $x$, then

$$\pi_d(x) \sim C_2 \prod_{p \mid d,\, p>2} \frac{p-1}{p-2} \, \frac{x}{\log^2(x)}. \qquad (1)$$

Here the product is over the odd prime divisors $p \ge 3$ of $d$ and we use the notation $f(x) \sim g(x)$ in the
usual sense: $\lim_{x\to\infty} f(x)/g(x) = 1$. The twin constant $C_2 \equiv 2c_2$ is defined by the infinite
product:

$$C_2 \equiv 2c_2 = 2 \prod_{p>2} \left(1 - \frac{1}{(p-1)^2}\right) = 1.32032363169\ldots \qquad (2)$$

Computer results of the search for pairs of primes separated by a distance $d \le 512$ and
smaller than $x$, for $x$ running over powers of 2 up to $2^{44} \approx 1.76 \times 10^{13}$, are shown in Fig. 1 and they provide firm
support in favor of (1). The characteristic oscillating pattern of points is caused by the product

$$\mathfrak{S}(d) = \prod_{p \mid d,\, p>2} \frac{p-1}{p-2} \qquad (3)$$

appearing in (1). The main period of oscillations is $6 = 2 \times 3$, with superimposed higher harmonics
$30 = 2 \times 3 \times 5$ and $210 = 2 \times 3 \times 5 \times 7$, i.e. where $\mathfrak{S}(d)$ has local maxima (the local minima are equal to 1 and
they correspond to $d = 2^m$). The red lines present $\pi_d(x)/\mathfrak{S}(d)$ and they are perfect straight
lines $C_2 x/\log^2(x)$.
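The oscillation pattern described above can be reproduced by evaluating the product (3) directly. The following is a minimal sketch, not code from the paper; the function names `odd_prime_divisors` and `singular_factor` are illustrative choices, and exact rational arithmetic is used only to make the values easy to inspect.

```python
from fractions import Fraction


def odd_prime_divisors(d):
    """Return the distinct odd prime divisors of a positive integer d."""
    primes = []
    while d % 2 == 0:          # strip the factor 2, which does not enter eq. (3)
        d //= 2
    p = 3
    while p * p <= d:
        if d % p == 0:
            primes.append(p)
            while d % p == 0:
                d //= p
        p += 2
    if d > 1:                  # whatever is left is a single odd prime
        primes.append(d)
    return primes


def singular_factor(d):
    """Product over odd primes p | d of (p-1)/(p-2), as in eq. (3)."""
    result = Fraction(1)
    for p in odd_prime_divisors(d):
        result *= Fraction(p - 1, p - 2)
    return result


if __name__ == "__main__":
    # Powers of 2 give the minimum 1; multiples of 6, 30, 210 give the peaks.
    for d in (2, 4, 6, 8, 16, 30, 210):
        print(d, singular_factor(d))
```

Under these assumptions the factor equals 1 for every power of 2 and grows each time a new small odd prime divides $d$, which is the origin of the peaks at multiples of 6, 30 and 210.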

There is a large body of evidence, both analytical and experimental, in favor of (1). Besides the original
circle method used by Hardy and Littlewood, there appeared papers [36] and [38] where
other heuristic arguments were presented. Even the particular case $d = 2$, corresponding to
the famous problem of the existence of infinitely many twin primes, is not solved. In May 2004, in
a preprint [2], Arenstorf attempted to prove that there are infinitely many twins.
However, shortly afterwards an error in the proof was pointed out by Tenenbaum [Hj. For recent
progress towards a proof of the infinitude of twins see [23].

The above notation $\pi_d(x)$ denotes prime pairs that are not necessarily successive. Not much is
known about gaps between consecutive primes, which seem to be more interesting and more difficult
than the case of pairs of arbitrary (not consecutive) primes treated by the Hardy-Littlewood
conjecture B. Let $\tau_d(x)$ denote the number of pairs of consecutive primes smaller than a given
bound $x$ and separated by $d$:

$$\tau_d(x) = \#\{\text{pairs } p_n, p_{n+1} < x \text{ with } d = p_{n+1} - p_n\}. \qquad (4)$$

For odd $d$ we supplement this definition by putting $\tau_{2k+1}(x) = 0$. The pairs of primes separated
by $d = 2$ and $d = 4$ are special, as they always have to be consecutive primes (with the exception
of the pair (3,7) containing 5 in the middle). In this paper we will present a simple heuristic
reasoning leading to a formula for $\tau_d(x)$ expressed directly by $\pi(x)$, the total number of
primes up to $x$.

Figure 1: The plot of $\pi_d(x)$ (eq. (1)) obtained from the computer search for $d = 2, 4, \ldots, 512$
and for $x$ running over powers of 2. In red the ratios $\pi_d(x)/\mathfrak{S}(d)$ are plotted, showing explicitly that the
characteristic oscillating pattern with peaks at $6k$, $30k$, $210k$ is caused by the product $\mathfrak{S}(d)$.

A few main questions related to the problem of gaps $d_n = p_{n+1} - p_n$ between consecutive
primes can be distinguished. There are gaps of arbitrary length between primes: namely, the $n$
numbers $(n+1)! + 2, (n+1)! + 3, (n+1)! + 4, \ldots, (n+1)! + n + 1$ are all composite. But it
is not known whether for every even $d$ there exists a pair of consecutive primes $p_n, p_{n+1}$ with
$d = p_{n+1} - p_n$. Growth rates of the form $d_n = O(p_n^{\theta})$ with different $\theta$ were proved in the past.
A few results with $\theta$ closest to 1/2 are the results of: C. Mozzochi [29], S. Lou and Q.
Yao obtained $\theta = 6/11$ [25], R. C. Baker and G. Harman improved it to $\theta = 0.535$ [3] and
recently R. C. Baker, G. Harman and J. Pintz [4] improved it by 0.01 to $\theta = 21/40 = 0.525$,
which currently remains the best unconditional result. For a review of results on $\theta$ see [35].
The Riemann Hypothesis implies $d_n = O(\sqrt{p_n}\log(p_n))$ and $\theta = \frac{1}{2} + \epsilon$ for any $\epsilon > 0$. On the
other hand, the second question about $d_n$ concerns the existence of very large gaps. Let $G(x)$
denote the largest gap between consecutive primes below a given bound $x$:

$$G(x) = \max_{p_n < x} (p_n - p_{n-1}). \qquad (5)$$

For this function lower bounds are searched for: $G(x) > f(x)$. The Prime Number Theorem
$\pi(x) \sim x/\log(x)$ trivially gives $G(x) > \log(x)$. A better inequality,

$$G(x) \ge (c e^{\gamma} + o(1)) \frac{\log(x) \log\log(x) \log\log\log\log(x)}{(\log\log\log(x))^2}, \qquad (6)$$

where $\gamma = 0.577216\ldots$ is the Euler-Mascheroni constant, was proved by H. Maier and C.
Pomerance in [27] with $c = 1.31256\ldots$ and improved by J. Pintz to $c = 2$ in [?].

In the last few years a team of D. A. Goldston, J. Pintz, and C. Y. Yildirim has published a
series of papers marking a breakthrough in some problems concerning prime numbers;
for a review see [13], [35]. Among the results obtained by them are the following, related to the
subject of this paper:

$$\liminf_{n\to\infty} \frac{p_{n+1} - p_n}{\log p_n} = 0 \qquad (7)$$

and, under appropriate unproved conjectures, they also showed that there are infinitely many
primes $p_n, p_{n+1}$ such that

$$p_{n+1} - p_n \le 16. \qquad (8)$$

In 1946 there appeared a paper [TT], where the problem of different patterns of pairs, triplets
etc. of primes was treated by probabilistic methods. In particular, a formula for the number
of primes $< x$ separated by a gap $d$ was deduced on p. 57 from probabilistic arguments.

In 1974 there appeared a paper by Brent [6], where statistical properties of the distribution
of gaps between consecutive primes were studied both theoretically and numerically. Brent
applied the inclusion-exclusion principle and obtained from (1) a formula for the number of
consecutive prime pairs less than $x$ separated by $d$. But his result (formula (4) in [6]) does not
have a closed form and he had to produce on the computer a table of constants appearing in
his formula (4). An attempt to estimate these sums and to write a closed formula for them
was undertaken in [32]. However, in this paper we will present a completely different approach
to the problem of prime gaps.

The paper is organized as follows: In Sect. 2 we will present a formula for $\tau_d(x)$. As applications
of this expression, in Sections 3 and 4 we will give formulas for $G(x)$ and $\sum_{p_n<x}(p_n - p_{n-1})^2$
expressed directly by $\pi(x)$ and compare them with available computer data. In Section 5 we will
consider the sums of reciprocals of all consecutive primes separated by a gap $d$ and propose
a compact expression giving the values of these sums for $d \ge 6$. In Sect. 6 we will derive
from the formulas obtained in Sect. 5 the Euler-Mertens dependence of the harmonic prime sum
$\sum_{p<x} 1/p \sim \log\log(x)$. Next, a heuristic formula for the first occurrence of a given gap
between consecutive primes is proposed in Sect. 7. In the last Sect. 8 the behaviour of the sequence
$\sqrt{p_{n+1}} - \sqrt{p_n}$ is considered in connection with the Andrica conjecture [1].
2 The basic conjecture

During a computer run lasting over seven months we have collected the values of
$\tau_d(x)$ up to $x = 2^{48} \approx 2.8147 \times 10^{14}$. During the computer search the data representing the
function $\tau_d(x)$ were stored at values of $x$ forming a geometrical progression with the ratio 2,
i.e. at powers of 2 ending with $2^{47}, 2^{48}$. Such a choice of the intermediate thresholds as powers of 2
was determined by the employed computer program, in which the primes were coded as bits.
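For small bounds the counting described above can be imitated in a few lines. The sketch below is only an illustration and makes no claim about the program actually used for the seven-month run: it assumes a plain sieve of Eratosthenes with one byte per integer (the names `primes_below` and `consecutive_gap_counts` are hypothetical) and tabulates $\tau_d(x)$ as defined in (4), starting from the prime 3 so that the odd gap between 2 and 3 is left out.

```python
from collections import Counter
from math import isqrt


def primes_below(x):
    """Sieve of Eratosthenes returning all primes p < x."""
    if x < 3:
        return []
    flags = bytearray([1]) * x          # flags[i] == 1 means "i is still a prime candidate"
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(x) + 1):
        if flags[i]:
            # cross out i*i, i*i + i, ... in one slice assignment
            flags[i * i::i] = bytearray(len(range(i * i, x, i)))
    return [i for i in range(x) if flags[i]]


def consecutive_gap_counts(x):
    """tau_d(x): number of pairs of consecutive odd primes below x with gap d."""
    primes = primes_below(x)
    counts = Counter()
    for p, q in zip(primes[1:], primes[2:]):   # primes[1] == 3, so the gap 3 - 2 is skipped
        counts[q - p] += 1
    return counts


if __name__ == "__main__":
    for d, n in sorted(consecutive_gap_counts(10**6).items()):
        print(d, n)
```

For $x = 100$ this gives 8, 7, 7 and 1 pairs for $d = 2, 4, 6, 8$; the counts add up to $\pi(100) - 2 = 23$, and the gaps add up to $97 - 3 = 94$.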



The data are available for downloading from http://www.ift.uni.wroc.pl/~mwolf/gaps.zip
The resulting curves are plotted in Fig. 2.

In the plots of $\tau_d(x)$ in Fig. 2 a lot of regularities can be observed. The pattern of points in
Fig. 2 does not depend on $x$: for each $x$ the arrangement of circles is the same, only the intercept
increases and the slope decreases. As in the case of $\pi_d(x)$, the oscillations are described by the
product $\mathfrak{S}(d)$, see the inset in Fig. 2. It is also possible to plot $\tau_d(x)$ for a couple of
values of $d$ as a function of $x$, but such a graph does not reveal the regularities seen in Fig. 2. The
fact that the points in Fig. 2 lie around straight lines on the semi-logarithmic scale suggests
for $\tau_d(x)$ the following

Ansatz 1:

$$\tau_d(x) = \mathfrak{S}(d) B(x) F^d(x), \qquad (9)$$

where $F(x) < 1$ (because $\tau_d(x)$ decreases with $d$).

Figure 2: Plots of $\tau_d(x)$ for $x$ running over powers of 2 up to $2^{48}$. In the inset plots of $\tau_d(x)/\mathfrak{S}(d)$ are shown
for the same set of $x$. Exponential fits are plotted in red.

The essential point of the considerations presented below is the possibility of determining
the unknown functions $F(x)$ and $B(x)$ by assuming only the above exponential decrease
of $\tau_d(x)$ with $d$ and employing two identities fulfilled by $\tau_d(x)$ just by definition. First of all, the
number of all gaps is smaller by 2 than the number of all primes smaller than $x$:

$$\sum_{(p_n - p_{n-1}),\; p_n < x} 1 = \sum_{d=2}^{G(x)} \tau_d(x) = \pi(x) - 2, \qquad (10)$$

where $\pi(x)$ denotes the number of primes smaller than $x$ and $G(x)$ is the largest gap below $x$,
defined in (5). The second self-consistency condition comes from the observation
that the sum of differences between consecutive primes $p_n < x$ is equal to the largest prime
$< x$ (minus 3, coming from the distance to $p_2 = 3$), and for large $x$ we can write:

$$\sum_{p_n < x} (p_n - p_{n-1}) = \sum_{d=2}^{G(x)} d\,\tau_d(x) \approx x. \qquad (11)$$

Writing here $x$ instead of the largest prime $< x$ leads in the worst case to an error of the
order $G(x) \sim \log^2(x)$, see Sect. 3. The erratic behavior of the product $\mathfrak{S}(d)$ is an obstacle in
calculating the above sums (10) and (11). We will replace the product $\mathfrak{S}(d)$ in the sums by
its average value. In [5] E. Bombieri and H. Davenport have proved that:

$$\sum_{k=1}^{n} \prod_{p \mid k,\, p>2} \frac{p-1}{p-2} = \frac{n}{\prod_{p>2}\left(1 - \frac{1}{(p-1)^2}\right)} + O(\log^2(n)); \qquad (12)$$

i.e. in the limit $n \to \infty$ the number $1/\prod_{p>2}\left(1 - \frac{1}{(p-1)^2}\right)$ is the arithmetical average of the
product $\prod_{p \mid k,\, p>2} \frac{p-1}{p-2}$. Thus we will assume that for functions $f(k)$ going to zero like $\mathrm{const}^{-k}$ the
following identity holds:

$$\sum_{k=1}^{\infty} \prod_{p \mid k,\, p>2} \frac{p-1}{p-2} f(k) = \frac{1}{\prod_{p>2}\left(1 - \frac{1}{(p-1)^2}\right)} \sum_{k=1}^{\infty} f(k). \qquad (13)$$
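The averaging statement (12) is easy to probe numerically. The sketch below is an illustration under stated assumptions, not part of the original computation: it fills a table of the products $\prod_{p \mid k,\, p>2} (p-1)/(p-2)$ with a sieve over odd primes (the name `singular_factor_table` is hypothetical) and compares their arithmetic mean with $1/c_2 = 2/C_2$, using the numerical value of $C_2$ quoted in (2).

```python
C2 = 1.32032363169  # twin constant, value quoted in eq. (2)


def singular_factor_table(n):
    """table[k] = product over odd primes p | k of (p-1)/(p-2), for 0 <= k <= n."""
    table = [1.0] * (n + 1)
    is_composite = bytearray(n + 1)
    for p in range(3, n + 1, 2):
        if not is_composite[p]:                  # p is an odd prime
            factor = (p - 1) / (p - 2)
            for m in range(p, n + 1, p):         # every multiple of p picks up the factor once
                table[m] *= factor
                if m > p:
                    is_composite[m] = 1
    return table


def mean_singular_factor(n):
    """Arithmetic mean of the product in eq. (12) over k = 1, ..., n."""
    table = singular_factor_table(n)
    return sum(table[1:]) / n


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        print(n, mean_singular_factor(n), 2.0 / C2)
```

Since the correction in (12) is only $O(\log^2 n)$, the mean should settle near $2/C_2 \approx 1.5148$ already for moderate $n$, which is what justifies replacing the product by a constant in the sums that follow.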



We extend the summations in (10) and (11) to infinity and, using Ansatz 1 and (13), we
get a geometrical and a differentiated geometrical series. For odd $d$ we have defined $\tau_{2k+1}(x) = 0$.
Then, writing $d = 2k$, we obtain:

$$\sum_{d=2}^{\infty} \tau_d(x) = \frac{B(x)}{c_2} \sum_{k=1}^{\infty} F^{2k}(x) = \frac{1}{c_2} \frac{B(x)F^2(x)}{1 - F^2(x)}, \qquad (14)$$

$$\sum_{d=2}^{\infty} d\,\tau_d(x) = \frac{2B(x)}{c_2} \sum_{k=1}^{\infty} k F^{2k}(x) = \frac{1}{c_2} \frac{2B(x)F^2(x)}{(1 - F^2(x))^2}. \qquad (15)$$

By extending the summations in (10) and (11) to infinity, $G(x) \to \infty$, we made an error of the
order $O(F(x)^{G(x)+2})$ in the first case and an error $O(G(x)F(x)^{G(x)+2})$ in the second equation,
both going to zero for $x \to \infty$, because for $x \to \infty$ we have $G(x) \to \infty$. That indeed
$O(G(x)F(x)^{G(x)+2})$ goes to zero for $x \to \infty$ can be checked a posteriori from the formulas for
$F(x)$ and for $G(x) \sim \log^2(x)$, see Sect. 3. Thus we obtain two equations:

$$\frac{1}{c_2} \frac{B(x)F^2(x)}{1 - F^2(x)} = \pi(x), \qquad \frac{1}{c_2} \frac{2B(x)F^2(x)}{(1 - F^2(x))^2} = x, \qquad (16)$$

of which the solutions are

$$B(x) = \frac{2c_2 \pi^2(x)}{x} \, \frac{1}{1 - \frac{2\pi(x)}{x}}, \qquad F^2(x) = 1 - \frac{2\pi(x)}{x}, \qquad (17)$$

and a posteriori the inequality $F(x) < 1$ holds evidently. Finally, we state the main:
Conjecture 1

The function $\tau_d(x)$ is expressed directly by $\pi(x)$:

$$\tau_d(x) = C_2 \prod_{p \mid d,\, p>2} \frac{p-1}{p-2} \, \frac{\pi^2(x)}{x} \left(1 - \frac{2\pi(x)}{x}\right)^{\frac{d}{2}-1} + \mathrm{error\ term}(x, d) \quad \text{for } d \ge 6. \qquad (18)$$

For Twins ($d = 2$) and Cousins ($d = 4$) the identities $\tau_{2,4}(x) = \pi_{2,4}(x)$ hold. Because $d$ is even,
the power of $(1 - 2\pi(x)/x)$ has a finite number of terms. The formula (18) consists of three
terms. The first one depends only on $d$, the second only on $x$, but the third term depends both
on $d$ and $x$. In the usual probabilistic approach one should obtain $(1 - \pi(x)/x)^{d-1}$, see e.g. fU],
[13, p. 3]: to have a pair of adjacent primes separated by $d$ there have to be $d - 1$ consecutive
composite numbers in between, and the probability of such an event is $(1 - \pi(x)/x)^{d-1}$; then the
term in front of it comes from the normalization condition.
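The main term of (18) is simple enough to evaluate by hand, but a short script makes the two normalization conditions (10) and (11) visible. The sketch below is illustrative only: the function name `conjecture1` and the choice of passing the value of the product over $p \mid d$ as an argument are assumptions made here, and $\pi(10^6) = 78498$ is the standard tabulated value.

```python
C2 = 1.32032363169  # twin constant, value quoted in eq. (2)


def conjecture1(d, x, pi_x, singular):
    """Main term of eq. (18); `singular` is the value of the product over odd p | d."""
    q = 1.0 - 2.0 * pi_x / x
    return C2 * singular * pi_x**2 / x * q ** (d // 2 - 1)


if __name__ == "__main__":
    x, pi_x = 10**6, 78498
    mean_singular = 2.0 / C2   # average value of the product, see eqs. (12)-(13)

    # With the product replaced by its mean, the sums over even d reproduce
    # pi(x) and x, which is exactly how B(x) and F(x) were fixed in (16)-(17).
    even_d = range(2, 4000, 2)
    total = sum(conjecture1(d, x, pi_x, mean_singular) for d in even_d)
    weighted = sum(d * conjecture1(d, x, pi_x, mean_singular) for d in even_d)
    print(total, pi_x)
    print(weighted, x)

    # Main term for d = 6 (product over p | 6 equals 2) at x = 10^6.
    print(conjecture1(6, x, pi_x, 2.0))
```

The truncation of the sums at $d < 4000$ is harmless here, because the geometric factor $(1 - 2\pi(x)/x)^{d/2-1}$ is already far below machine precision at that point.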

Although (18) is postulated for $d \ge 6$, we get from it for $d = 2$:

Conjecture 2

$$\tau_2(x) = \pi_2(x) = C_2 \frac{\pi^2(x)}{x} + \mathrm{error\ term}(x) \qquad (19)$$

instead of the usual conjectures

$$\pi_2(x) \sim C_2 \frac{x}{\log^2(x)} \qquad (20)$$

or

$$\pi_2(x) \sim C_2 \int_2^x \frac{du}{\log^2(u)} \equiv C_2 \mathrm{Li}_2(x). \qquad (21)$$

Remark: The equation (19) expresses the intuitively obvious fact that the number of twins
should be proportional to the square of $\pi(x)$. Of course, (19) for $\pi(x) \sim x/\log(x)$ goes into
(20).

We have checked with the available computer data that (19) is better than (20) but worse
than (21). Because $\mathrm{Li}_2(x)$ in (21) increases monotonically while there are local fluctuations in
the density of primes and twins, the above formula (19) incorporates all irregularities in the
distribution of primes into the formula for the number of twins. Since both $d = 2$ and $d = 4$ gaps
are necessarily consecutive, we propose the identical expression (19) for $\tau_4(x) = \pi_4(x) \approx \pi_2(x)$.



It is possible to obtain another form of the formula for $\tau_d(x)$, more convenient for later
applications. Namely, let us represent the function $F(x)$ in the form $F(x) = e^{-A(x)}$, i.e. now
Ansatz 1 has the form:

Ansatz 1'

$$\tau_d(x) \sim B(x) \mathfrak{S}(d) e^{-A(x)d}, \qquad (22)$$

where $A(x)$ is the slope of the lines plotted in red in Fig. 2 and, as we can see, $A(x)$ goes to zero for
$x \to \infty$. In the equations (16) we use in the numerators the approximation $e^{-2A(x)} \approx 1 - 2A(x)$
and in the denominators $1 - e^{-2A(x)} \approx 2A(x)$ for small $A(x)$, obtaining finally

Conjecture 1'

$$\tau_d(x) = C_2 \frac{\pi^2(x)}{x - 2\pi(x)} \prod_{p \mid d,\, p>2} \frac{p-1}{p-2} \, e^{-d\pi(x)/x} + \mathrm{error\ term}(x, d) \quad \text{for } d \ge 6. \qquad (23)$$

For large $x$ we can skip $2\pi(x)$ in comparison with $x$ in the denominator and obtain finally the
following pleasant formula:

Conjecture 1''

$$\tau_d(x) = C_2 \frac{\pi^2(x)}{x} \prod_{p \mid d,\, p>2} \frac{p-1}{p-2} \, e^{-d\pi(x)/x} + \mathrm{error\ term}(x, d) \quad \text{for } d \ge 6. \qquad (24)$$



In equations (23) and (24) the term in the exponent has a simple interpretation: the difference
$d$ is divided by the mean gap $x/\pi(x)$ between consecutive primes. Because for small $u$ the
approximation $\log(1 - u) \approx -u$ holds, we can turn for large $x$ the conjecture (18) into the form
of conjecture 1':

$$\left(1 - \frac{2\pi(x)}{x}\right)^{\frac{d}{2}} \approx e^{-d\pi(x)/x}. \qquad (25)$$

Next we see that for large $x$ both (18) and (23) go into the conjecture (24).

Putting in (24) $\pi(x) \sim x/\log(x)$ and comparing with the original Hardy-Littlewood conjecture
we obtain that the number $\tau_d(x)$ of successive primes $(p_{n+1}, p_n)$ smaller than $x$ and of
the difference $d$ ($= p_{n+1} - p_n$) is diminished by the factor $\exp(-d/\log(x))$ in comparison with
the number of all pairs of primes $(p, p')$ at the distance $d = p' - p$:

$$\tau_d(x) \sim \pi_d(x) e^{-d/\log(x)} \quad \text{for } d \ge 6. \qquad (26)$$

Heuristically, this relation encodes in the series for $e^{-d/\log(x)}$ the inclusion-exclusion principle
for obtaining $\tau_d(x)$ from $\pi_d(x)$. The above relation is confirmed by comparing Figures
1 and 2. R. P. Brent in [6], using the inclusion-exclusion principle, obtained from
conjecture B of Hardy and Littlewood a formula for $\tau_d(x)$ which agrees very well with computer
results. However, the formula of Brent (eq. (4) in the paper [6]) is not of a closed form: it
contains a double sequence of constants $A_{r,k}$, which can be calculated only by a direct use
of the computer, which is very time consuming; see the discussion of S. Herzog at the web site
http://mac6.ma.psu.edu/primes. In Table 2 of [6] R. P. Brent compares the number of actual
gaps $d = 2, \ldots, 80$ in the interval studied there with the numbers predicted from his formula, finding
perfect agreement. An analogous method to determine the values of $\tau_d(x)$ was employed in [32,
see eq. (2-8) and the preceding formula]. The formula (2-8) from [32], adapted to our notation,
has the form:

$$\tau_d(x) \sim C_2 \mathfrak{S}(d) \int_2^x \frac{\exp(-d/\log(u))}{\log^2(u)} \, du. \qquad (27)$$

Integrating the above integral once by parts gives a term $x e^{-d/\log(x)}/\log^2(x)$ corresponding to
(24) with $\pi(x) \sim x/\log(x)$.



It is not possible at present to guess an analytical form of the error terms in formulas (18), (23) and
(24) (let us remark that the error term in the twins conjectures (20) or (21) is not
known even heuristically). The only way to obtain some information about the behaviour of
error term$(x,d)$ is to compare these conjectures with actual computer counts of $\tau_d(x)$. Of
course, the formula (18) has the best accuracy. We have compared it with the actual values of
$\tau_d(x)$ generated by the computer, i.e. we have looked at the values of

$$\Delta(x,d) = \tau_d(x) - C_2 \mathfrak{S}(d) \frac{\pi^2(x)}{x} \left(1 - \frac{2\pi(x)}{x}\right)^{\frac{d}{2}-1}. \qquad (28)$$

Figure 3: Plots of $\Delta(x, d)$ on the double logarithmic scale for $d = 6, 22, 44, 56, 62, 78$. On
the $y$ axis we have plotted $\log_{10}(\Delta(x, d))$ if $\Delta(x,d) > 0$ and $-\log_{10}(-\Delta(x, d))$ if $\Delta(x,d) < 0$.

Figure 4: Plots of the ratios of the values predicted from Conjecture 1 to the real values of $\tau_d(x)$
for $d = 6, 22, 44, 56, 62, 78$. The plots begin at such $x$ that $\tau_d(x) > 1000$ to avoid large initial
fluctuations of these ratios (see the initial parts of the curves in the previous Figure).

The values of $\Delta(x, d)$ were stored for 105 values of $d = 2, 4, \ldots, 210$ ($= 2 \cdot 3 \cdot 5 \cdot 7$) at the
arguments $x$ forming the geometrical progression $x_k = 1000 \times (1.03)^k$. Additionally, the values
of $|\Delta(x,d)| < 9$ were stored to catch sign changes of $\Delta(x,d)$. It is difficult to present these
data for all values of $d$. We have found that for some gaps $d$ there was a monotonic increase
of $\Delta(x,d)$, while for other gaps there were sign changes of the difference $\Delta(x,d)$, see Fig. 3. For 30
values of $d$ out of all 105 examined we have found sign changes in the searched range of $x$. Surprising is the
steep growth of $\Delta(x, d)$ for $d = 44, 56, 78$ (we have seen the same behaviour for other values of
$d$) in the region of crossing the $y = 0$ line. In fact, there were 76 sign changes of $\Delta(x, 54)$, 109
sign changes of $\Delta(x, 56)$ and 207 sign changes of $\Delta(x, 78)$. The general rule is that the ratio
$\tau_d(x) \big/ \left[C_2 \mathfrak{S}(d) \frac{\pi^2(x)}{x}\left(1 - \frac{2\pi(x)}{x}\right)^{\frac{d}{2}-1}\right]$ tends to 1, see Fig. 4. Thus we formulate the
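Conjecture 3

For every $d$ there are infinitely many sign changes of the functions $\Delta(x, d)$. For fixed $d$ we
guess

$$\lim_{x\to\infty} \frac{C_2 \mathfrak{S}(d) \frac{\pi^2(x)}{x}\left(1 - \frac{2\pi(x)}{x}\right)^{\frac{d}{2}-1}}{\tau_d(x)} = 1. \qquad (29)$$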



We can test the conjecture (24) with the available computer data by plotting on one graph the
scaled quantities:

$$D(x, d) = \frac{d\,\pi(x)}{x}, \qquad T_d(x) = \frac{x\,\tau_d(x)}{C_2 \mathfrak{S}(d) \pi^2(x)}. \qquad (30)$$

From the conjecture (24) we expect that the points $(D(x, d), T_d(x))$, $d = 2, 4, \ldots, G(x)$, should
coincide for each $x$: the function $\tau_d(x)$ displays scaling, in the physical terminology. In Fig. 5
we have plotted the points $(D(x, d), T_d(x))$ for $x = 2^{28}, 2^{38}, 2^{48}$. If we denote $u = D(x, d)$ then
all these scaled functions should lie on the pure exponential decrease $e^{-u}$ (Poisson distribution,
see [12, p. 60]), shown in red in Fig. 5. We have determined by the least square method the slopes
$s(x)$ of the fits $a(x)e^{-s(x)u}$ to the linear parts of $(D(x, d), \log(T_d(x)))$. The results are presented
in Fig. 6. The slopes tend to 1 very slowly: over 6 orders of $x$ they change from 1.187 to
1.136.

Figure 5: Plots of $(D(x,d), T_d(x))$ for $x = 2^{28}, 2^{38}, 2^{48}$ and, in red, the plot of $e^{-u}$. Only
the points with $\tau_d(x) > 1000$ were plotted to avoid fluctuations at large $D(x,d)$.

Figure 6: Plot of the slopes obtained from fitting
straight lines to $(D(d,x), \log(T_d(x)))$ for $x = 2^{28}, 2^{29}, \ldots, 2^{48}$.



3 Maximal gap between consecutive primes 



From (18) or (24) we can obtain an approximate formula for $G(x)$ by assuming that the maximal difference
$G(x)$ appears only once, so $\tau_{G(x)}(x) = 1$: simply, the largest gap is equal to the value of $d$
at which $\tau_d(x)$ touches the $d$-axis in Fig. 2. Skipping the oscillating term $\mathfrak{S}(d)$, which is very
often close to 1, we get for $G(x)$ the following estimate expressed directly by $\pi(x)$:

Conjecture 4

$$G(x) \sim g(x) \equiv \frac{x}{\pi(x)} \left(2\log(\pi(x)) - \log(x) + c\right), \qquad (31)$$

where $c = \log(C_2) = 0.2778769\ldots$.

Remark: The above formula explicitly reveals the fact that the value of $G(x)$ is connected
with the number of primes $\pi(x)$: more primes means smaller $G(x)$. It is intuitively obvious: if
we draw randomly from the set of natural numbers $\{1, 2, \ldots, N\}$ some subset of different numbers
$r_1, r_2, \ldots, r_k$ and calculate the differences $\delta_i = r_{i+1} - r_i$, then for larger $k$ we will expect smaller $\delta_i$:
more elements in the subset means smaller gaps between them.

For the Gauss approximation $\pi(x) \sim x/\log(x)$ the following dependence follows:

$$G(x) \sim \log(x) \left(\log(x) - 2\log\log(x) + c\right) \qquad (32)$$

and for large $x$ it passes into the well known Cramer's pjj conjecture:

$$G(x) \sim \log^2(x). \qquad (33)$$
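The estimate (31) can be compared with the true maximal gap for a bound small enough to sieve directly. The following sketch is an illustration only; the helper names `primes_below`, `max_gap_below` and `g_estimate` are hypothetical, and the choice $x = 10^6$ is made here purely because it runs in a fraction of a second.

```python
import math
from math import isqrt

C2 = 1.32032363169  # twin constant, value quoted in eq. (2)


def primes_below(x):
    """Sieve of Eratosthenes returning all primes p < x."""
    if x < 3:
        return []
    flags = bytearray([1]) * x
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(x) + 1):
        if flags[i]:
            flags[i * i::i] = bytearray(len(range(i * i, x, i)))
    return [i for i in range(x) if flags[i]]


def max_gap_below(x):
    """G(x): the largest difference between consecutive primes below x."""
    primes = primes_below(x)
    return max(q - p for p, q in zip(primes, primes[1:]))


def g_estimate(x, pi_x):
    """g(x) from eq. (31), with c = log(C2)."""
    return x / pi_x * (2.0 * math.log(pi_x) - math.log(x) + math.log(C2))


if __name__ == "__main__":
    x = 10**6
    pi_x = len(primes_below(x))
    print("pi(x)      =", pi_x)
    print("G(x)       =", max_gap_below(x))
    print("g(x)       =", g_estimate(x, pi_x))
    print("log(x)**2  =", math.log(x) ** 2)
```

At $x = 10^6$ the sieve gives $\pi(x) = 78498$ and a maximal gap of 114, while (31) evaluates to roughly 114.7; the bare Cramer form (33) gives about 191 at the same point.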



The comparison of the formula (31) and the formula (33) with the available results of the
computer search is given in Fig. 7. The lists of known maximal gaps between consecutive
primes we have taken from our own computer search up to $2^{48}$ and, for larger values, from the web sites
www.trnicely.net and www.ieeta.pt/~tos/gaps.html. The largest known gap 1476 between
consecutive primes follows the prime $1425172824437699411 = 1.42\ldots \times 10^{18}$. On these web
sites tabulated values of $\pi(x)$ can also be found and we have used them to plot the formula
(31). Let $\nu_G(T)$ denote the number of sign changes of the difference $G(x) - g(x)$ for $2 < x < T$.
There are 33 sign changes of the difference $G(x) - g(x)$ in Fig. 7 and $\nu_G(T)$ is presented in
the inset in Fig. 7. The least square method gives for $\nu_G(T)$ the equation $0.786 \log(T) + 0.569$.

There appeared in the literature a few other formulas for $G(x)$, see e.g. [10], [ID]; in particular
D. R. Heath-Brown in [21, p. 74] gives the following formula:

$$G(x) \sim \log(x)(\log(x) + \log\log\log(x)). \qquad (34)$$

A. Granville argued [16] that the actual $G(x)$ can be larger than that given by (33), namely
he claims that there are infinitely many pairs of primes for which:

$$p_{n+1} - p_n = G(p_n) > 2e^{-\gamma} \log^2(p_n) = 1.12292\ldots \log^2(p_n), \qquad (35)$$

where $\gamma = 0.577216\ldots$ is the Euler-Mascheroni constant. The estimate (35) follows from the
inequalities proved by H. Maier in the paper [26], which cast doubt on Cramer's ideas. For
another contradiction between Cramer's model and reality, see [34].



4 The Heath-Brown conjecture on the $\sum_{p_n<x}(p_n - p_{n-1})^2$

As an application of the formula (18) we consider the conjecture made by D. R. Heath-Brown in
[20]. Assuming the validity of the Riemann Hypothesis and a special form of the Montgomery
conjecture on the pair correlation function of the zeros of the $\zeta(s)$ function, Heath-Brown
conjectured in this paper that

$$\sum_{p_n < x} (p_n - p_{n-1})^2 \sim 2x\log(x). \qquad (36)$$

Erdos conjectured that the r.h.s. should be const $x\log^2(x)$, see [TTJ bottom of p. 20]. From the
guessed formula (18) we obtain the above sum expressed directly by $\pi(x)$ (we have extended
the summation over $d$ up to infinity and used (13); then the dependence on $C_2$ drops out):

Conjecture 5 

Ex9 ,9 / N / 37r(x) 27r^(x) \ , 

Pri<X d=2,Afi,... ^ ^ " \ / 
Figure 7: The comparison of $G(x)$ and $g(x)$ as well as of the Cramer conjecture. In the inset
there is the plot of the number of crossings of the curves representing $G(x)$ and $g(x)$. This figure
should be compared with the figure on page 12 in [18].



For large $x$ we can reduce the above formula to a simple form:

$$\sum_{p_n < x} (p_n - p_{n-1})^2 \approx \frac{2x^2}{\pi(x)}, \qquad (38)$$

which for $\pi(x) \sim x/\log(x)$ gives exactly (36). The above equation is intuitively obvious: the
sum of squares of $(p_n - p_{n-1})$ is proportional to $x^2$ and inversely proportional to $\pi(x)$, because
more primes means smaller differences $p_n - p_{n-1}$ on average. The same formula $2x^2/\pi(x)$ is
obtained from the conjecture (24) in the limit of large $x$. In Table I the comparison between
the predictions (36) and (37) and real computer data is shown. As is seen from the ratio columns,
the convergence towards 1 of the ratio of the real data to the Heath-Brown prediction (36) is very slow, while
the expression (37) predicts the actual numbers $\sum_{p_n<x}(p_n - p_{n-1})^2$ better.
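For small bounds the sum of squared gaps can be computed directly and set against the two asymptotic forms $2x\log(x)$ of (36) and $2x^2/\pi(x)$ of (38). The sketch below is an illustration under stated assumptions and is not the program behind Table I: the helper names are hypothetical, and the sum here runs over all consecutive primes below $x$, including the gap between 2 and 3, which changes the result by 1 only.

```python
import math
from math import isqrt


def primes_below(x):
    """Sieve of Eratosthenes returning all primes p < x."""
    if x < 3:
        return []
    flags = bytearray([1]) * x
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(x) + 1):
        if flags[i]:
            flags[i * i::i] = bytearray(len(range(i * i, x, i)))
    return [i for i in range(x) if flags[i]]


def sum_squared_gaps(x):
    """Sum of (q - p)^2 over all consecutive primes p < q < x."""
    primes = primes_below(x)
    return sum((q - p) ** 2 for p, q in zip(primes, primes[1:]))


def heath_brown_prediction(x):
    """Right-hand side of eq. (36)."""
    return 2.0 * x * math.log(x)


def simple_prediction(x, pi_x):
    """Right-hand side of eq. (38)."""
    return 2.0 * x * x / pi_x


if __name__ == "__main__":
    for k in (4, 5, 6):
        x = 10**k
        pi_x = len(primes_below(x))
        actual = sum_squared_gaps(x)
        print(x, actual, heath_brown_prediction(x), simple_prediction(x, pi_x))
```

Because $\pi(x)$ exceeds $x/\log(x)$ in this range, the form (38) is always the smaller of the two predictions, which is the direction in which Table I shows the real data to lie.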



In the past, sums over large differences between consecutive primes were studied in the literature,
see e.g. [22], [T3]. For example, D. Goldston has proved, assuming the Riemann
Hypothesis, that

$$\sum_{\substack{p_n < x \\ p_n - p_{n-1} \ge H}} (p_n - p_{n-1}) = O\!\left(\frac{x\log(x)}{H}\right) \qquad (39)$$

uniformly for $H \ge 1$, while from (18) we get

$$\sum_{\substack{p_n < x \\ p_n - p_{n-1} \ge H}} (p_n - p_{n-1}) = \sum_{d \ge H} d\,\tau_d(x) = \frac{x^2}{x - 2\pi(x)} \left(1 - \frac{2\pi(x)}{x}\right)^{H/2} \left(1 + \frac{(H-2)\pi(x)}{x}\right). \qquad (40)$$

For $H = 2$ it gives the correct value $x$ on the r.h.s. of the above equation. Putting in the r.h.s. of
(40) $\pi(x) = x/\log(x)$ and expanding with respect to $1/\log(x)$ for large $x$ we obtain:

$$\sum_{\substack{p_n < x \\ p_n - p_{n-1} \ge H}} (p_n - p_{n-1}) = x - \frac{H(H-2)}{2} \frac{x}{\log^2(x)} + O\!\left(\frac{x}{\log^3(x)}\right), \qquad (41)$$

which for $x$ so large that $\log(x) > H$ is indeed smaller than the upper bound (39) of Goldston. In
general, expressions for sums of the form $\sum_{H<d<K} f(d)\tau_d(x)$ can be obtained in closed form if the
sums are differentiated geometrical series in $d$.

TABLE I

The sum of squares of gaps between consecutive primes. In the second column the numbers
obtained by computer are given, while the third column contains the values obtained from eq. (36) and
the fifth those from eq. (37). The fourth and sixth columns contain the appropriate
ratios.

x       sum (p_n - p_{n-1})^2    eq. (36)             ratio     eq. (37)             ratio
2^24    444929861                558195733            0.7971    488725881            0.9104
2^26    1959715561               2418848443           0.8102    2141587523           0.9151
2^28    8565851937               10419653325          0.8221    9313220996           0.9198
2^30    37168128501              44655665552          0.8323    40239313423          0.9237
2^32    160316134721             190530845965         0.8414    172900857995         0.9272
2^34    687851546609             80975609432?         0.8495    739353131559         0.9303
2^36    2938092559089            3429555231277        0.8567    3148372990028        0.9332
2^38    12499933597193           14480344308470       0.8632    13357112013493       0.9358
2^40    52993288896469           60969870777867       0.8692    56482296752813       0.9382
2^42    223959886541173          256073457287370      0.8746    238142313949083      0.9404
2^44    943825347126665          1073069725777350     0.8796    1001414251864841     0.9425
2^46    3967383251021137         4487382489617471     0.8841    4201009869963194     0.9444
2^48    16638404184530149        18729944304492034    0.8883    17585360374792679    0.9462

5 Generalized Brun's constants

In 1919 Brun [9] showed that the sum of the reciprocals of all twin primes is finite:

$$B_2 = \left(\frac{1}{3} + \frac{1}{5}\right) + \left(\frac{1}{5} + \frac{1}{7}\right) + \left(\frac{1}{11} + \frac{1}{13}\right) + \ldots < \infty. \qquad (42)$$

Sometimes 5 is included only once, but here we will adopt the above convention. The
analytical formula for $B_2$ is unknown and the sum (42) is called Brun's constant [H]. The
numerical estimates give [30] $B_2 = 1.90216058\ldots$. Here we are going to generalize the above
$B_2$ to the sums of reciprocals of all consecutive primes separated by the gap $d$ and to propose a
compact expression giving the values of these sums for $d \ge 6$.



Let $\mathcal{T}_d$ denote the set of consecutive primes separated by the distance $d$:

$$\mathcal{T}_d = \{(p_{n+1}, p_n) : p_{n+1} - p_n = d\}. \qquad (43)$$

We define the generalized Brun's constants by the formula:

$$B_d = \sum_{p \in \mathcal{T}_d} \frac{1}{p}. \qquad (44)$$

We adopt the rule that if a given gap $d$ appears two times in a row, $p_n - p_{n-1} = p_{n+1} - p_n$,
the corresponding middle prime $p_n$ is counted two times (in the case of $B_2$ only 5 appears two
times); e.g. for $d = 6$ we have the terms $\ldots + 1/47 + 1/53 + 1/53 + 1/59 + \ldots$ and next
$\ldots + 1/151 + 1/157 + 1/157 + 1/163 + \ldots$.
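The counting rule above translates directly into code: every pair of consecutive primes with gap $d$ contributes both of its members, so a prime sitting between two equal gaps is automatically added twice. The sketch below is illustrative only; the names `primes_below` and `partial_brun_sum` are hypothetical, and as a simplification it includes a pair only when both of its primes lie below $x$.

```python
from fractions import Fraction
from math import isqrt


def primes_below(x):
    """Sieve of Eratosthenes returning all primes p < x."""
    if x < 3:
        return []
    flags = bytearray([1]) * x
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(x) + 1):
        if flags[i]:
            flags[i * i::i] = bytearray(len(range(i * i, x, i)))
    return [i for i in range(x) if flags[i]]


def partial_brun_sum(d, x):
    """Sum of 1/p + 1/q over consecutive primes p < q < x with q - p == d."""
    primes = primes_below(x)
    total = Fraction(0)
    for p, q in zip(primes, primes[1:]):
        if q - p == d:
            # both members are added, so a middle prime such as 53 in
            # (47, 53), (53, 59) is counted twice, as the rule requires
            total += Fraction(1, p) + Fraction(1, q)
    return total


if __name__ == "__main__":
    print(partial_brun_sum(2, 8))             # (1/3 + 1/5) + (1/5 + 1/7)
    print(float(partial_brun_sum(6, 10**4)))
```

Exact fractions are used here only to keep small examples checkable by hand; for large $x$ ordinary floating point accumulation would be the practical choice.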

B. Segal has proved [39] that the sum in (44) is convergent for every $d$, thus the generalized
Brun's constants are finite. Because of that the sums (44) can be called Brun-Segal constants
for $d > 2$.

Let us define the partial (finite) sums:

$$B_d(x) = \sum_{p \in \mathcal{T}_d,\, p < x} \frac{1}{p}. \qquad (45)$$

We have computed the quantities $B_d(x)$ for $x$ up to $x = 2^{46} \approx 7.037 \times 10^{13}$.
The partial generalized Brun's constants $B_d(x)$ were stored at powers of 2 up to $2^{46}$ and the data are
available for download from http://www.ift.uni.wroc.pl/~mwolf/Brun.zip. In Fig. 8 we
present a part of the obtained data.

The dependence of $B_2(x)$ on $x$ is usually (see [S], [7]) obtained by appealing to the
conjecture (20) (i.e. the Hardy-Littlewood conjecture (1) for $d = 2$). It gives that the probability
of finding a pair of twins in the vicinity of $x$ is $2c_2/\log^2(x)$, so the expected value of the finite
approximation to the Brun constant can be estimated as follows:

$$B_2(x) = B_2(\infty) - \sum_{p \in \mathcal{T}_2,\, p > x} \frac{1}{p} \approx B_2 - 4c_2 \int_x^{\infty} \frac{du}{u\log^2(u)} = B_2 - \frac{4c_2}{\log(x)}. \qquad (46)$$

It means that the plot of the finite approximations $B_2(x)$ to the original Brun constant is a linear
function of $1/\log(x)$. The same reasoning applies mutatis mutandis to the gap $d = 4$.

Figure 8: The plot of $B_d(x)$ for $x = 2^{26} = 6.71\ldots \times 10^{7}$, $2^{36} = 6.87\ldots \times 10^{10}$, $2^{46} = 7.04\ldots \times 10^{13}$.
The fit $a(x)e^{-b(x)d}/d$ to $B_d(x)/\mathfrak{S}(d)$ obtained by the least square method is plotted in red. In
the inset the values of $a(x)$ as well as the differences between the conjectured slope $1/\log(x)$ and the
actual fit $b(x)$ are shown for $x$ running over powers of 2 up to $2^{46}$.

To repeat the above reasoning, valid for $d = 2, 4$, for larger $d$, an analog of the Hardy-Littlewood
conjecture for the pairs of consecutive primes separated by the distance $d$ is needed, and we will use
the form (24) for $\tau_d(x)$ (the integrals occurring below can be calculated analytically also for
(18)). Putting in the equation (24) $\pi(x) = x/\log(x)$ we obtain for $d \ge 6$:

$$B_d(x) = B_d(\infty) - \sum_{p \in \mathcal{T}_d,\, p > x} \frac{1}{p} \approx B_d - 4c_2 \prod_{p \mid d,\, p>2} \frac{p-1}{p-2} \int_x^{\infty} \frac{e^{-d/\log(u)}}{u\log^2(u)} \, du, \qquad (47)$$

and the integral can be calculated explicitly:

$$B_d(x) = B_d(\infty) + \frac{4c_2}{d} \prod_{p \mid d,\, p>2} \frac{p-1}{p-2} \left(e^{-d/\log(x)} - 1\right). \qquad (48)$$

From this it follows that the partial sums $B_d(x)$ for $d \ge 6$ should depend linearly on $e^{-d/\log(x)}$,
instead of the linear dependence on $1/\log(x)$ for $B_2(x)$ and $B_4(x)$.
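The step from (47) to (48) rests on a single substitution. A short worked version, written here as a sketch with the auxiliary variable $v$ introduced only for this purpose, is:

```latex
% Substitution v = 1/\log u, so that dv = -\,du / (u \log^2 u);
% the limits u = x and u \to \infty become v = 1/\log x and v \to 0.
\int_x^{\infty} \frac{e^{-d/\log u}}{u \log^2 u}\, du
  = \int_0^{1/\log x} e^{-d v}\, dv
  = \frac{1 - e^{-d/\log x}}{d}.
```

Multiplying by $-4c_2 \prod_{p \mid d,\, p>2} \frac{p-1}{p-2}$ reproduces the bracket $\left(e^{-d/\log(x)} - 1\right)$ in (48), and letting $d \to 0$ in the result recovers the value $1/\log(x)$ of the integral used in (46).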

Because Bd{x) is for x = 1 (in fact each Bd{x) will be zero up to the first occurrence of 
the gap d, see Sect. [T]), we take in (48) the limit x and obtain 

Bd{oc) ^Bd=^T\ ^ for d>6. (49) 
d p — 2 

p\d ^ 
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Thus the formula expressing the x dependence of Bd{x) has the form: 

BJx) = ^ TT ^^e-'^/^"^^") + error termid, x). (50) 
p\d ^ 

The characteristic shape of the dependence of Bd{x)/&{d) on d is described by the relation 
log{Bd{x)/&{d)) ~ — log((i) — d/\og{x): if d/\og{x) > log{d) the linear dependence on d 
preponderates. In these linear parts we have fitted by least square method the dependence 
log(a(a;)) — db{x) to the actual values of \og{dBd{x)/2C2&{d)). We have obtained, that indeed 
b{x) tends to l/log(a:) and a{x) tends to 1 with increasing x, see the inset in Fig. [sj 
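The closed forms (49) and (50) are easy to evaluate numerically. The following Python sketch is one possible way to do it. The function names are introduced here for illustration only, the value of $c_2$ is taken as half of the constant $C_2$ quoted in (2), and the unknown error term of (50) is simply ignored.

```python
import math

# c_2 = C_2 / 2, where C_2 = 1.32032363169... is the twin constant from (2).
C2_SMALL = 0.6601618158468696


def odd_prime_divisors(d):
    """Return the distinct odd primes dividing d >= 1 (plain trial division)."""
    primes = []
    while d % 2 == 0:
        d //= 2
    p = 3
    while p * p <= d:
        if d % p == 0:
            primes.append(p)
            while d % p == 0:
                d //= p
        p += 2
    if d > 1:
        primes.append(d)
    return primes


def singular_product(d):
    """Product of (p - 1)/(p - 2) over the odd primes p dividing d."""
    prod = 1.0
    for p in odd_prime_divisors(d):
        prod *= (p - 1) / (p - 2)
    return prod


def brun_limit(d):
    """Leading term of formula (49) for the generalized Brun constant B_d."""
    return 4.0 * C2_SMALL * singular_product(d) / d


def brun_partial(d, x):
    """Leading term of formula (50) for B_d(x), error term omitted."""
    return brun_limit(d) * math.exp(-d / math.log(x))


if __name__ == "__main__":
    for d in (6, 8, 30, 210):
        print(d, singular_product(d), brun_limit(d), brun_partial(d, 2.0 ** 46))
```

The exponential factor in `brun_partial` is always below 1, so the sketch reproduces the qualitative statement that $B_d(x)$ approaches $B_d$ from below as $x$ grows.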



The comparison of the formula (49) with the values extrapolated from the partial approximations $B_d(2^{48})$ by the formula

$$B_d(\infty) = B_d(2^{48}) + \frac{4c_2}{d} \prod_{p \mid d,\; p>2} \frac{p-1}{p-2} \left(1 - e^{-d/\log(2^{48})}\right), \tag{51}$$

obtained from the equation (48), is shown in Fig. 9 for $d \ge 6$; the values predicted by (49) for $d = 2$ and $d = 4$ are skipped. Because on average the product $\mathfrak{S}(d)$ is equal to $1/c_2$, we can write $B_d \sim 4/d$. Let us mention that $4/d$ provides remarkably good approximations to $B_2 = 1.90216058\ldots$ and $B_4 = 1.19705\ldots$.

The outcome of the above analysis allows us to make the

Conjecture 6

$$B_d(\infty) \equiv B_d = \frac{4c_2}{d} \prod_{p \mid d,\; p>2} \frac{p-1}{p-2} + \text{error term}(d), \qquad \text{for } d \ge 6. \tag{52}$$

The data shown in the inset in Fig. 9 suggest that the error term should decrease with $d$.
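The averaging argument used above (that the product $\mathfrak{S}(d)$ is on average equal to $1/c_2$, which gives $B_d \sim 4/d$) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names and the chosen cut-offs are introduced here, and the running mean is taken over even $d$ only.

```python
# c_2 = C_2 / 2, where C_2 = 1.32032363169... is the twin constant from (2).
C2_SMALL = 0.6601618158468696


def singular_product(d):
    """Product of (p - 1)/(p - 2) over the odd primes p dividing d >= 1."""
    prod = 1.0
    while d % 2 == 0:
        d //= 2
    p = 3
    while p * p <= d:
        if d % p == 0:
            prod *= (p - 1) / (p - 2)
            while d % p == 0:
                d //= p
        p += 2
    if d > 1:  # the remaining cofactor is an odd prime
        prod *= (d - 1) / (d - 2)
    return prod


def mean_singular_product(max_d):
    """Arithmetic mean of the product over even d = 2, 4, ..., max_d."""
    evens = range(2, max_d + 1, 2)
    return sum(singular_product(d) for d in evens) / len(evens)


if __name__ == "__main__":
    print("1/c_2 =", 1.0 / C2_SMALL)
    for max_d in (100, 1000, 10000):
        print(max_d, mean_singular_product(max_d))
```

With the mean of the product replaced by $1/c_2$, the factor $4c_2\mathfrak{S}(d)/d$ in (49) reduces to $4/d$.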



6 Mertens' Theorem on the prime harmonic sum.

It is well known, that the sum of reciprocals of all primes smaller than x is given by |28], fl9[ 
Theorems 427 and 428], jlK] : 

$$\sum_{p<x} \frac{1}{p} = \log(\log(x)) + M + o(1); \tag{53}$$

here $M = 0.2614972\ldots$ is the Mertens constant and it has a few representations:

$$M = \gamma + \sum_{p} \left(\log(1 - 1/p) + 1/p\right) = \gamma + \sum_{k=2}^{\infty} \mu(k) \log(\zeta(k))/k, \tag{54}$$

where $\gamma = 0.5772\ldots$ is the Euler-Mascheroni constant, $\mu(n)$ is the Moebius function and $\zeta(s)$ is the Riemann zeta function. On the other hand, the sum $\sum_{p<x} 1/p$ can be expressed by finite approximations to the generalized Brun's
Figure 9: The plot of the generalized Brun's constants $B_d$ against $d$, extrapolated from (51) (marked by circles) and predicted by (49) (marked by squares). In the inset the ratio of the values obtained from these two equations is plotted.



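constants:

$$\sum_{p<x} \frac{1}{p} = \frac{1}{2} + \frac{1}{6} + M' + \frac{1}{2} \sum_{d=2}^{G(x)} \frac{4c_2}{d} \prod_{p \mid d,\; p>2} \frac{p-1}{p-2}\, e^{-d/\log(x)}. \tag{55}$$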



Because each prime except 2 and 3 (hence the terms 1/2 and 1/6) appears as both the right and the left end of adjacent pairs, we have to multiply the sum by 1/2 (we remind that we have adopted in Sect. 5 the convention that if a given gap $d$ appears two times in a row, $p_n - p_{n-1} = p_{n+1} - p_n = d$, the corresponding middle prime $p_n$ is counted two times). We have introduced here the constant $M'$, which accounts for the sum of the unknown error terms in (50) and also incorporates the fact that the dependence of $B_2(x)$ and $B_4(x)$ on $x$ is not described by the formula (50) but by (46). The sum in (55) runs over even $d$ and extends up to the greatest gap $G(x)$ between two consecutive primes smaller than $x$. For $G(x)$ we will use Cramer's formula (33). To get rid of the product $\mathfrak{S}(d)$ we make use of (13) and obtain:

$$\sum_{p<x} \frac{1}{p} = \frac{2}{3} + M' + \sum_{d=2}^{G(x)} \frac{2}{d}\, e^{-d/\log(x)} = \frac{2}{3} + M' + \sum_{k=1}^{\frac{1}{2}G(x)} \frac{q^k}{k}, \qquad q = e^{-2/\log(x)}. \tag{56}$$

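Expanding $\log(1 - q)$, where $0 < q < 1$, into the series we obtain

$$\sum_{k=1}^{n} \frac{q^k}{k} = -\log(1 - q) + \int_0^q \frac{u^n}{u - 1}\, du. \tag{57}$$

For large $x$ the term with the logarithm goes into:

$$\log\left(1 - e^{-2/\log(x)}\right) = -\log(\log(x)) + \log(2) + O(1/\log(x)). \tag{58}$$

Now, by the weighted mean value theorem we calculate the integral:

$$\int_0^q \frac{u^n}{u - 1}\, du = \frac{1}{\theta q - 1}\, \frac{q^{n+1}}{n + 1}, \qquad 0 < \theta < 1. \tag{59}$$

But $q = \exp(-2/\log(x)) < 1$ and:

$$\left|\frac{1}{\theta q - 1}\right| < \frac{1}{1 - q} = \frac{e^{2/\log(x)}}{e^{2/\log(x)} - 1} < \log(x)\, e^{2/\log(x)} = O(\log(x)). \tag{60}$$

For $x \gg 1$ we have by virtue of the Cramer conjecture that in our case $n \sim \frac{1}{2}\log^2(x)$, thus:

$$\frac{1}{\theta q - 1}\, \frac{q^{n+1}}{n + 1} = O\left(\frac{1}{x \log(x)}\right). \tag{61}$$

Finally we obtain from (50) and (55):

$$\sum_{p<x} \frac{1}{p} = \log(\log(x)) + M' + \frac{2}{3} - \log(2) + O(1/\log(x)). \tag{62}$$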



Because 2/3 is practically equal to $\log(2)$, to obtain consistency with Mertens' theorem we have to postulate that $M' \approx M$. The comparison of the Mertens estimate for $\sum_{p<x} 1/p$ with data obtained by a computer is shown in Fig. 10. In a separate run of the computer program we have checked that up to $1.4 \times 10^{14}$ there are almost 550000 sign changes of the difference $\sum_{p<x} 1/p - \log(\log(x)) - M$; the first sign change appears at 5,788,344,558,967.
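Both ingredients of this section can be checked on a small scale. The Python sketch below is an illustration under stated assumptions, not the program used for the data in the paper: it compares the prime harmonic sum with the Mertens estimate (53) for modest $x$, and it evaluates the truncated series $\sum_{k \le n} q^k/k$ with $q = e^{-2/\log(x)}$ and $n = \frac{1}{2}\log^2(x)$ against $-\log(1-q)$. All function names and the chosen values of $x$ are introduced here.

```python
import math

MERTENS = 0.2614972128  # the Mertens constant M, truncated


def primes_below(limit):
    """Return all primes p < limit using the sieve of Eratosthenes."""
    if limit < 3:
        return []
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit - 1) + 1):
        if sieve[i]:
            # strike out i*i, i*i + i, ... below limit
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i in range(limit) if sieve[i]]


def prime_harmonic_sum(x):
    """Sum of 1/p over primes p < x."""
    return sum(1.0 / p for p in primes_below(x))


def mertens_estimate(x):
    """Right-hand side of (53) without the o(1) term."""
    return math.log(math.log(x)) + MERTENS


def truncated_log_series(x):
    """Return (sum_{k<=n} q^k/k, -log(1-q)) for q = exp(-2/log x), n = log^2(x)/2."""
    log_x = math.log(x)
    q = math.exp(-2.0 / log_x)
    n = int(0.5 * log_x ** 2)
    partial = sum(q ** k / k for k in range(1, n + 1))
    return partial, -math.log(1.0 - q)


if __name__ == "__main__":
    for x in (10 ** 4, 10 ** 5, 10 ** 6):
        partial, full = truncated_log_series(x)
        print(x, prime_harmonic_sum(x), mertens_estimate(x), full - partial)
```

The last printed column is the part of the series cut off at $n$, which the argument above bounds by $O(1/(x\log(x)))$.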



7 First occurrence of a given gap between consecutive 
primes 

In this section we will present the heuristic reasoning leading to the formula for the first appearance of a given gap of length $d$, see e.g. [21], [8], [17], [31].

Figure 10: The plot of the prime harmonic sum up to x = 2^^, ...,2'^^ and the Merten's approx- 
imation to it. The original of this figure has y axis of the length 8 cm and spans the interval (2.5, 
3.8), so if the x axis would be plotted in the linear scale instead of logarithmic, then it should be 
5.33(3) X 10^ km long — that is the size of the Solar System. 



We will use the conjecture (50) to estimate the position of the first appearance of a pair of consecutive primes separated by a gap of length $d$. More specifically, let:

$$p_f(d) = \begin{cases} \text{the minimal prime such that the next prime is } p' = p_f(d) + d, \\ \infty \quad \text{if there is no pair of consecutive primes with } p_{n+1} - p_n = d. \end{cases}$$

It is not known whether gaps of arbitrary (even) length exist or not; in other words, the answer to the question "Is it true that for every even $d$ there is $p_f(d) < \infty$?" is unknown [8].

We can obtain a heuristic formula for $p_f(d)$ by remarking that the finite approximations to the generalized Brun's constants are for the first time different from zero at $p_f(d)$, and then they are approximately equal to $2/p_f(d)$:

$$\frac{2}{p_f(d)} = \frac{4c_2}{d} \prod_{p \mid d,\; p>2} \frac{p-1}{p-2}\, e^{-d/\log(p_f(d))}.$$

Referring to the argument that on average $\mathfrak{S}(d)$ is equal to $1/c_2$, we skip $\mathfrak{S}(d)$ and $c_2$. Taking logarithms and neglecting $\log(2) = 0.69314\ldots$, we end up with the quadratic equation for $t = \log(p_f(d))$:

$$t^2 - t\log(d) - d = 0.$$

The positive solution of this equation gives:
Conjecture 7

$$p_f(d) \sim \sqrt{d}\, \exp\left(\frac{1}{2}\sqrt{\log^2(d) + 4d}\right). \tag{65}$$

The comparison of this formula with the actual available data from the computer search is shown in Fig. 11. Most of the points plotted on this figure come from our own search up to $2^{48} = 2.815\ldots \times 10^{14}$. First occurrences $p_f(d) > 2^{48}$ we have taken from http://www.trnicely.net and http://www.ieeta.pt/~tos/gaps.html. In Fig. 11 there is also a plot of the conjecture
made by Shanks [iO] : 

$$p_f(d) \sim e^{\sqrt{d}}, \tag{66}$$

while from (65) for large $d$ it follows that

$$p_f(d) \sim \sqrt{d}\, e^{\sqrt{d}}. \tag{67}$$

Figure 11: The plot of $p_f(d)$ against $d$ and the approximations to it given by (65) and (66).
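The heuristic of this section amounts to solving the quadratic equation $t^2 - t\log(d) - d = 0$ for $t = \log(p_f(d))$ and exponentiating its positive root. A minimal Python sketch of that step follows, together with the Shanks form (66) and the large-$d$ form (67) for comparison. The function names are introduced here for illustration, and no actual first-occurrence data are included.

```python
import math


def log_first_occurrence(d):
    """Positive root t of t^2 - t*log(d) - d = 0, i.e. the estimate of log p_f(d)."""
    log_d = math.log(d)
    return 0.5 * (log_d + math.sqrt(log_d ** 2 + 4.0 * d))


def first_occurrence_estimate(d):
    """Estimate (65): sqrt(d) * exp(sqrt(log^2(d) + 4d) / 2)."""
    return math.exp(log_first_occurrence(d))


def shanks_estimate(d):
    """Estimate (66): exp(sqrt(d))."""
    return math.exp(math.sqrt(d))


def large_d_estimate(d):
    """Estimate (67): sqrt(d) * exp(sqrt(d))."""
    return math.sqrt(d) * math.exp(math.sqrt(d))


if __name__ == "__main__":
    for d in (10, 100, 500, 1000):
        print(d, first_occurrence_estimate(d), shanks_estimate(d), large_d_estimate(d))
```

Since $\sqrt{\log^2(d) + 4d} > 2\sqrt{d}$, the estimate (65) always lies above (67), which in turn exceeds (66) by the factor $\sqrt{d}$.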



8 The Andrica Conjecture 

In the last section we will make use of most of the conjectures formulated so far. The Andrica 
conjecture [T] (see also [T71 p. 21] and p. 191]) states that the inequality: 

$$A_n = \sqrt{p_{n+1}} - \sqrt{p_n} < 1, \tag{68}$$

where $p_n$ is the $n$-th prime number, holds for all $n$. Despite its simplicity it remains unproved. In Table II the values of $A_n$ are sorted in descending order (it is believed this order will persist forever).
We have

$$\sqrt{p_{n+1}} - \sqrt{p_n} = \frac{p_{n+1} - p_n}{\sqrt{p_{n+1}} + \sqrt{p_n}} < \frac{d_n}{2\sqrt{p_n}}. \tag{69}$$

From this we see that a growth rate of the form $d_n = O(p_n^{\theta})$ with $\theta < 1/2$ will suffice for the proof of (68), but as we have mentioned in the Introduction, currently the best unconditional result is $\theta = 21/40$ [4].

For twin primes $p_{n+1} = p_n + 2$ there is no problem with (68), and in general for short gaps $d_n = p_{n+1} - p_n$ between consecutive primes the inequality (68) will be satisfied. The Andrica conjecture can be violated only by extremely large gaps between consecutive primes. Let us denote the pair of primes $< x$ comprising the largest gap $G(x)$ by $p_{L+1}(x)$ and $p_L(x)$, hence we have

$$G(x) = p_{L+1}(x) - p_L(x). \tag{70}$$



Thus we will concentrate on the values of the difference appearing in (68) corresponding to the largest gaps, and so let us introduce the function:

$$R(x) = \sqrt{p_{L+1}(x)} - \sqrt{p_L(x)}. \tag{71}$$

Then we have:

$$A_n \le R(p_n). \tag{72}$$



TABLE II

   n    p_n    p_{n+1}    d_n    sqrt(p_{n+1}) - sqrt(p_n)
   4      7       11        4    0.6708735
  30    113      127       14    0.6392819
   9     23       29        6    0.5893333
   6     13       17        4    0.5175544
  11     31       37        6    0.5149982
   2      3        5        2    0.5040172
   8     19       23        4    0.4369326
  15     47       53        6    0.4244553
  46    199      211       12    0.4191031
  34    139      149       10    0.4167295
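The entries of Table II can be recomputed directly from the definition (68). The Python sketch below is a minimal illustration, assuming a search bound of $10^5$ (an arbitrary choice made here, as are the function names); it lists the largest values of $A_n$ found among the primes below that bound.

```python
import math


def primes_below(limit):
    """Return all primes p < limit using the sieve of Eratosthenes."""
    if limit < 3:
        return []
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit - 1) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i in range(limit) if sieve[i]]


def andrica_table(limit, rows=10):
    """Return the `rows` largest A_n as tuples (A_n, n, p_n, p_{n+1}, d_n)."""
    primes = primes_below(limit)
    entries = []
    # n is the 1-based index of the prime p in the sequence 2, 3, 5, ...
    for n, (p, q) in enumerate(zip(primes, primes[1:]), start=1):
        entries.append((math.sqrt(q) - math.sqrt(p), n, p, q, q - p))
    entries.sort(reverse=True)
    return entries[:rows]


if __name__ == "__main__":
    for a_n, n, p, q, d in andrica_table(10 ** 5):
        print(f"{n:4d} {p:6d} {q:6d} {d:4d} {a_n:.7f}")
```

By the bound (69), a pair with $p_n > 10^5$ can only enter such a list if its gap is comparable to $\sqrt{p_n}$, so the finite bound should not affect the leading rows.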



For a given gap $d$ the largest value of the difference $\sqrt{p + d} - \sqrt{p}$ will appear at the first appearance of this gap: each next pair $(p', p' + d)$ of consecutive primes separated by $d$ will produce a smaller difference (see (69)):

$$\sqrt{p' + d} - \sqrt{p'} < \sqrt{p + d} - \sqrt{p}. \tag{73}$$
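Hence, we have to focus our attention on the first occurrences $p_f(d)$ of the gaps. Using the conjecture (67) we calculate

$$\sqrt{p_f(d) + d} - \sqrt{p_f(d)} = \sqrt{\sqrt{d}\, e^{\sqrt{d}} + d} - \sqrt{\sqrt{d}\, e^{\sqrt{d}}} \approx \frac{1}{2}\, d^{3/4} e^{-\frac{1}{2}\sqrt{d}}, \tag{74}$$

where the last step uses $d \ll \sqrt{d}\, e^{\sqrt{d}}$ for large $d$.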



Substituting here for $d$ the maximal gap $g(x)$ given by Conjecture 4 (31) we obtain the approximate formula for $R(x)$:

Conjecture 8

$$R(x) = \frac{1}{2}\, g(x)^{3/4}\, e^{-\frac{1}{2}\sqrt{g(x)}} + \text{error term}. \tag{75}$$



The comparison with real data is given in Figure 12.



The maximum of the function $\frac{1}{2} x^{3/4} e^{-\frac{1}{2}\sqrt{x}}$ is reached at $x = 9$ and has the value $0.57971\ldots$. The maximal value of $A_n$ is $0.6708735\ldots$ for $d = 4$ and the second value is $0.6392819\ldots$ for $d = 14$. Let us remark that $d = 9$ is exactly in the middle between 4 and 14.
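The location and height of this maximum can be checked by hand. A short derivation, writing $f(x) = \frac{1}{2} x^{3/4} e^{-\frac{1}{2}\sqrt{x}}$ (the name $f$ is introduced here only for convenience):

$$\frac{d}{dx}\log f(x) = \frac{3}{4x} - \frac{1}{4\sqrt{x}} = 0 \iff \sqrt{x} = 3 \iff x = 9,$$

$$f(9) = \frac{1}{2} \cdot 9^{3/4} \cdot e^{-3/2} = \frac{1}{2} \cdot 5.19615\ldots \cdot 0.22313\ldots = 0.57971\ldots$$

The logarithmic derivative is positive for $x < 9$ and negative for $x > 9$, so this stationary point is indeed the maximum.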

Because in (75) $R(x)$ contains the exponential of $\sqrt{g(x)}$, it is very sensitive to the form of $g(x)$. The substitution $g(x) = \log^2(x)$ leads to the form:

$$R(x) = \frac{\log^{3/2}(x)}{2\sqrt{x}}. \tag{76}$$



This form of $R(x)$ is plotted in Fig. 12 in green. If we use the guess $p_f(d) \sim e^{\sqrt{d}}$ (66) made by D. Shanks then we get the expression:

$$\sqrt{p_f(d) + d} - \sqrt{p_f(d)} \approx \frac{1}{2}\, d\, e^{-\frac{1}{2}\sqrt{d}} \tag{77}$$

instead of (74). Substituting here for $d$ the form (32) leads to the curve plotted in Fig. 12 in blue.



Finally, let us remark that from the above analysis it follows that

$$\lim_{n \to \infty} \left(\sqrt{p_{n+1}} - \sqrt{p_n}\right) = 0. \tag{78}$$

The above limit was mentioned on p. 61 in [ISj as a difficult problem (yet unsolved). 






9 Conclusions 

We have formulated eight conjectures on the gaps between consecutive primes; in particular we have expressed the maximal gap $G(x)$ directly by $\pi(x)$. The guessed formulas are well confirmed by existing computer data. Their proofs seem to be far away, and in conclusion we quote here the following remarks of R. Penrose from [33], p. 422:
Figure 12: The plot of $R(x)$ against $x$ and the approximations to it given by (75), (76) and (77). There are 75 maximal gaps available currently and hence there are 75 circles in the plot of $R(x)$. To calculate $g(x)$ given by (31) we have used tabulated values of $\pi(x)$ available at the web sites www.trnicely.net and www.ieeta.pt/~tos/primes.html. There are over 50 crossings of our formula (75) with $R(x)$.



Rigorous argument is usually the last step! Before that, one has to make many 
guesses, and for these, aesthetic convictions are enormously important — always 
constrained by logical arguments and known facts. 
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